Mining Actionable Subspace Clusters in Sequential Data

Authors

  • Kelvin Sim
  • Ardian Kristanto Poernomo
  • Vivekanand Gopalkrishnan
Abstract

Extracting knowledge from data and using it for decision making is vital in various real-world problems, particularly in the financial domain. We identify several financial problems that require the mining of actionable subspaces defined by objects and attributes over a sequence of time. These subspaces are actionable in the sense that they can suggest profitable actions to decision-makers. We propose to mine actionable subspace clusters from sequential data, which are subspaces with high and correlated utilities. To mine them efficiently, we propose a framework, MASC (Mining Actionable Subspace Clusters), which is a hybrid of numerical optimization, principal component analysis and frequent itemset mining. We conduct a wide range of experiments to demonstrate the actionability of the clusters and the robustness of MASC. We show that our clustering results are not sensitive to the framework parameters and that full recovery of embedded clusters in synthetic data is possible. In our case study, we show that clusters with higher utilities correspond to higher actionability, and that our clusters can be used to perform better than one of the most famous value investment strategies.

Similar resources

An Efficient Actionable 3D Subspace Clustering Based on Optimal Centroids

An efficient actionable 3D subspace clustering method based on optimal centroids, for continuous-valued data represented in three dimensions, suited to real-world problems such as discovering profitable stocks and biologically significant protein residues. It yields actionable patterns and incorporates domain knowledge, allowing users to choose their preferred utility (profit/benefit) function, ...

Integration of Subspace Clustering and Action Detection on Financial Data

Object, attribute and context information are linked in the dimensional data models. Cluster quality is decided with domain knowledge and parameter setting requirements. CAT Seeker is a centroid-based actionable 3D subspace clustering framework. The CAT Seeker framework is used to find profitable actions. Singular value decomposition, numerical optimization and 3D frequent itemset mining methods are i...

Exploring Constraints Inconsistence for Value Decomposition and Dimension Selection Using Subspace Clustering

Datasets in object-attribute-time form are referred to as three-dimensional (3D) datasets. Because 3D datasets contain many timestamps, they are very difficult to cluster, so a subspace clustering method is applied to them. Existing algorithms are inadequate to solve this clustering problem. Most of them are not actionable (ability to suggest profitable or benef...

Less is More: Non-Redundant Subspace Clustering

Clustering is an important data mining task for grouping similar objects. In high-dimensional data, however, effects attributed to the “curse of dimensionality” render clustering meaningless. As a result, recent years have seen research on subspace clustering, which searches for clusters in relevant subspace projections of high-dimensional data. As the number of possibl...

Select actionable positive or negative sequential patterns

Negative sequential patterns (NSP) refer to sequences with non-occurring and occurring items, and can play an irreplaceable role in understanding and addressing many business applications. However, some problems occur after mining NSP, the most urgent one of which is how to select the actionable positive or negative sequential patterns. This is due to the following factors: 1) positive sequenti...

Publication date: 2010